Results 1 - 20 of 118
1.
Sci Rep ; 14(1): 1731, 2024 01 19.
Article in English | MEDLINE | ID: mdl-38243002

ABSTRACT

A growing body of research is focusing on real-world data (RWD) to supplement or replace randomized controlled trials (RCTs). However, due to the disparities in data generation mechanisms, differences are likely and necessitate scrutiny to validate the merging of these datasets. We compared the characteristics of RCT data from 5734 diabetic kidney disease patients with corresponding RWD from electronic health records (EHRs) of 23,523 patients. Demographics, diagnoses, medications, laboratory measurements, and vital signs were analyzed using visualization, statistical comparison, and cluster analysis. RCT and RWD sets exhibited significant differences in prevalence, longitudinality, completeness, and sampling density. The cluster analysis revealed distinct patient subgroups within both RCT and RWD sets, as well as clusters containing patients from both sets. We stress the importance of validation to verify the feasibility of combining RCT and RWD, for instance, in building an external control arm. Our results highlight general differences between RCT and RWD sets, which should be considered during the planning stages of an RCT-RWD study. If they are, RWD has the potential to enrich RCT data by providing first-hand baseline data, filling in missing data or by subgrouping or matching individuals, which calls for advanced methods to mitigate the differences between datasets.


Subject(s)
Diabetes Mellitus , Diabetic Nephropathies , Humans , Diabetic Nephropathies/epidemiology , Data Collection/methods , Electronic Health Records
2.
Leukemia ; 38(1): 109-125, 2024 01.
Article in English | MEDLINE | ID: mdl-37919606

ABSTRACT

Immunological control of residual leukemia cells is thought to occur in patients with chronic myeloid leukemia (CML) that maintain treatment-free remission (TFR) following tyrosine kinase inhibitor (TKI) discontinuation. To study this, we analyzed 55 single-cell RNA and T cell receptor (TCR) sequenced samples (scRNA+TCRαβ-seq) from patients with CML (n = 13, N = 25), other cancers (n = 28), and healthy (n = 7). The high number and active phenotype of natural killer (NK) cells in CML separated them from healthy and other cancers. Most NK cells in CML belonged to the active CD56dim cluster with high expression of GZMA/B, PRF1, CCL3/4, and IFNG, with interactions with leukemic cells via inhibitory LGALS9-TIM3 and PVR-TIGIT interactions. Accordingly, upregulation of LGALS9 was observed in CML target cells and TIM3 in NK cells when co-cultured together. Additionally, we created a classifier to identify TCRs targeting leukemia-associated antigen PR1 and quantified anti-PR1 T cells in 90 CML and 786 healthy TCRβ-sequenced samples. Anti-PR1 T cells were more prevalent in CML, enriched in bone marrow samples, and enriched in the mature, cytotoxic CD8+ TEMRA cluster, especially in a patient maintaining TFR. Our results highlight the role of NK cells and anti-PR1 T cells in anti-leukemic immune responses in CML.


Subject(s)
Leukemia, Myelogenous, Chronic, BCR-ABL Positive , Humans , Hepatitis A Virus Cellular Receptor 2 , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/pathology , Single-Cell Analysis
3.
Bioinformatics ; 39(12)2023 12 01.
Article in English | MEDLINE | ID: mdl-38070156

ABSTRACT

MOTIVATION: T cells play an essential role in the adaptive immune system to fight pathogens and cancer but may also give rise to autoimmune diseases. The recognition of a peptide-MHC (pMHC) complex by a T cell receptor (TCR) is required to elicit an immune response. Many machine learning models have been developed to predict the binding, but generalizing predictions to pMHCs outside the training data remains challenging. RESULTS: We have developed a new machine learning model that utilizes information about the TCR from both α and β chains, epitope sequence, and MHC. Our method uses ProtBERT embeddings for the amino acid sequences of both chains and the epitope, as well as convolution and multi-head attention architectures. We show the importance of each input feature as well as the benefit of including epitopes with only a few TCRs to the training data. We evaluate our model on existing databases and show that it compares favorably against other state-of-the-art models. AVAILABILITY AND IMPLEMENTATION: https://github.com/DaniTheOrange/EPIC-TRACE.


Subject(s)
Receptors, Antigen, T-Cell , T-Lymphocytes , Epitopes , Receptors, Antigen, T-Cell/chemistry , Amino Acid Sequence , T-Lymphocytes/metabolism , Protein Binding , Epitopes, T-Lymphocyte/metabolism
4.
Immunity ; 56(12): 2816-2835.e13, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38091953

ABSTRACT

Cancer cells can evade natural killer (NK) cell activity, thereby limiting anti-tumor immunity. To reveal genetic determinants of susceptibility to NK cell activity, we examined interacting NK cells and blood cancer cells using single-cell and genome-scale functional genomics screens. Interaction of NK and cancer cells induced distinct activation and type I interferon (IFN) states in both cell types depending on the cancer cell lineage and molecular phenotype, ranging from more sensitive myeloid to less sensitive B-lymphoid cancers. CRISPR screens in cancer cells uncovered genes regulating sensitivity and resistance to NK cell-mediated killing, including adhesion-related glycoproteins, protein fucosylation genes, and transcriptional regulators, in addition to confirming the importance of antigen presentation and death receptor signaling pathways. CRISPR screens with a single-cell transcriptomic readout provided insight into underlying mechanisms, including regulation of IFN-γ signaling in cancer cells and NK cell activation states. Our findings highlight the diversity of mechanisms influencing NK cell susceptibility across different cancers and provide a resource for NK cell-based therapies.


Subject(s)
Hematologic Neoplasms , Neoplasms , Humans , Killer Cells, Natural , Neoplasms/genetics , Antigen Presentation , Genomics , Cytotoxicity, Immunologic/genetics , Cell Line, Tumor
5.
Sci Rep ; 13(1): 15941, 2023 09 24.
Article in English | MEDLINE | ID: mdl-37743383

ABSTRACT

Better understanding of the early events in the development of type 1 diabetes is needed to improve prediction and monitoring of the disease progression during the substantially heterogeneous presymptomatic period of the beta cell damaging process. To address this concern, we used mass spectrometry-based proteomics to analyse longitudinal pre-onset plasma sample series from children positive for multiple islet autoantibodies who had rapidly progressed to type 1 diabetes before 4 years of age (n = 10) and compared these with similar measurements from matched children who were either positive for a single autoantibody (n = 10) or autoantibody negative (n = 10). Following statistical analysis of the longitudinal data, targeted serum proteomics was used to verify 11 proteins putatively associated with the disease development in a similar yet independent and larger cohort of children who progressed to the disease within 5 years of age (n = 31) and matched autoantibody negative children (n = 31). These data reiterated extensive age-related trends for protein levels in young children. Further, these analyses demonstrated that the serum levels of two peptides unique for apolipoprotein C1 (APOC1) were decreased after the appearance of the first islet autoantibody and remained relatively less abundant in children who progressed to type 1 diabetes, in comparison to autoantibody negative children.


Subject(s)
Diabetes Mellitus, Type 1 , Insulin-Secreting Cells , Humans , Child , Child, Preschool , Apolipoprotein C-I , Autoantibodies , Disease Progression
6.
Bioinformatics ; 39(39 Suppl 1): i347-i356, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387131

ABSTRACT

MOTIVATION: Signal peptides (SPs) are short amino acid segments present at the N-terminus of newly synthesized proteins that facilitate protein translocation into the lumen of the endoplasmic reticulum, after which they are cleaved off. Specific regions of SPs influence the efficiency of protein translocation, and small changes in their primary structure can abolish protein secretion altogether. The lack of conserved motifs across SPs, sensitivity to mutations, and variability in the length of the peptides make SP prediction a challenging task that has been extensively pursued over the years. RESULTS: We introduce TSignal, a deep transformer-based neural network architecture that utilizes BERT language models and dot-product attention techniques. TSignal predicts the presence of SPs and the cleavage site between the SP and the translocated mature protein. We use common benchmark datasets and show competitive accuracy in terms of SP presence prediction and state-of-the-art accuracy in terms of cleavage site prediction for most of the SP types and organism groups. We further illustrate that our fully data-driven trained model identifies useful biological information on heterogeneous test sequences. AVAILABILITY AND IMPLEMENTATION: TSignal is available at: https://github.com/Dumitrescu-Alexandru/TSignal.


Subject(s)
Amino Acids , Protein Sorting Signals , Protein Transport , Benchmarking , Language
7.
BMC Bioinformatics ; 24(1): 58, 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36810075

ABSTRACT

BACKGROUND: DNA methylation plays an important role in studying the epigenetics of various biological processes including many diseases. Although differential methylation of individual cytosines can be informative, given that methylation of neighboring CpGs is typically correlated, analysis of differentially methylated regions is often of more interest. RESULTS: We have developed a probabilistic method and software, LuxHMM, that uses a hidden Markov model (HMM) to segment the genome into regions and a Bayesian regression model, which allows handling of multiple covariates, to infer differential methylation of regions. Moreover, our model includes experimental parameters that describe the underlying biochemistry in bisulfite sequencing, and model inference is done using either variational inference for efficient genome-scale analysis or Hamiltonian Monte Carlo (HMC). CONCLUSIONS: Analyses of real and simulated bisulfite sequencing data demonstrate the competitive performance of LuxHMM compared with other published differential methylation analysis methods.


Subject(s)
Algorithms , DNA Methylation , Bayes Theorem , Epigenesis, Genetic , Sulfites , Sequence Analysis, DNA/methods
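
The abstract above states that LuxHMM uses a hidden Markov model to segment the genome into regions but does not describe the decoding step. As a minimal sketch of how an HMM can segment a sequence, the following decodes the most probable state path with the Viterbi algorithm. The two states ("low", "high"), the discretized observations ("u" for unmethylated, "m" for methylated), and all probabilities are hypothetical values chosen for illustration and are not taken from LuxHMM.

```python
from math import log

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path (all inputs in log space)."""
    # best[t][s]: log-probability of the best path that ends in state s at position t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: predecessor of state s on the best path to position t + 1
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + log_trans[r][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical toy parameters (assumptions, not values from the paper)
states = ["low", "high"]
log_start = {s: log(0.5) for s in states}
log_trans = {
    "low": {"low": log(0.9), "high": log(0.1)},
    "high": {"low": log(0.1), "high": log(0.9)},
}
log_emit = {
    "low": {"u": log(0.9), "m": log(0.1)},
    "high": {"u": log(0.1), "m": log(0.9)},
}
path = viterbi("uuummmuu", states, log_start, log_trans, log_emit)
```

Consecutive positions that share a decoded state form one candidate region; in this toy run the three "m" calls in the middle become a single "high" segment.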
8.
J Clin Invest ; 133(6)2023 03 15.
Article in English | MEDLINE | ID: mdl-36719749

ABSTRACT

BACKGROUND: Relatlimab plus nivolumab (anti-lymphocyte-activation gene 3 plus anti-programmed death 1 [anti-LAG-3+anti-PD-1]) has been approved by the FDA as a first-line therapy for stage III/IV melanoma, but its detailed effect on the immune system is unknown. METHODS: We evaluated blood samples from 40 immunotherapy-naive or prior immunotherapy-refractory patients with metastatic melanoma treated with anti-LAG-3+anti-PD-1 in a phase I trial using single-cell RNA and T cell receptor sequencing (scRNA+TCRαβ-Seq) combined with other multiomics profiling. RESULTS: The highest LAG3 expression was noted in NK cells, Tregs, and CD8+ T cells, and these cell populations underwent the most significant changes during the treatment. Adaptive NK cells were enriched in responders and underwent profound transcriptomic changes during the therapy, resulting in an active phenotype. LAG3+ Tregs expanded, but based on the transcriptome profile, became metabolically silent during the treatment. Last, higher baseline TCR clonality was observed in responding patients, and their expanding CD8+ T cell clones gained a more cytotoxic and NK-like phenotype. CONCLUSION: Anti-LAG-3+anti-PD-1 therapy has profound effects on NK cells and Tregs in addition to CD8+ T cells. TRIAL REGISTRATION: ClinicalTrials.gov (NCT01968109). FUNDING: Cancer Foundation Finland, Sigrid Juselius Foundation, Signe and Ane Gyllenberg Foundation, Relander Foundation, State funding for university-level health research in Finland, a Helsinki Institute of Life Sciences Fellow grant, Academy of Finland (grant numbers 314442, 311081, 335432, and 335436), and an investigator-initiated research grant from BMS.


Subject(s)
Antineoplastic Agents , Melanoma , Humans , Programmed Cell Death 1 Receptor , Melanoma/drug therapy , Melanoma/genetics , Nivolumab/therapeutic use , Antineoplastic Agents/pharmacology , CD8-Positive T-Lymphocytes , Receptors, Antigen, T-Cell/metabolism , Melanoma, Cutaneous Malignant
9.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36477794

ABSTRACT

MOTIVATION: T cells use T cell receptors (TCRs) to recognize small parts of antigens, called epitopes, presented by major histocompatibility complexes. Once an epitope is recognized, an immune response is initiated and T cell activation and proliferation by clonal expansion begin. Clonal populations of T cells with identical TCRs can remain in the body for years, thus forming immunological memory and potentially mappable immunological signatures, which could have implications in clinical applications including infectious diseases, autoimmunity and tumor immunology. RESULTS: We introduce TCRconv, a deep learning model for predicting recognition between TCRs and epitopes. TCRconv uses a deep protein language model and convolutions to extract contextualized motifs and provides state-of-the-art TCR-epitope prediction accuracy. Using TCR repertoires from COVID-19 patients, we demonstrate that TCRconv can provide insight into T cell dynamics and phenotypes during the disease. AVAILABILITY AND IMPLEMENTATION: TCRconv is available at https://github.com/emmijokinen/tcrconv. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
COVID-19 , Humans , Epitopes , Receptors, Antigen, T-Cell , T-Lymphocytes , Antigens , Epitopes, T-Lymphocyte
10.
Nat Commun ; 13(1): 5988, 2022 10 11.
Article in English | MEDLINE | ID: mdl-36220826

ABSTRACT

Analyzing antigen-specific T cell responses at scale has been challenging. Here, we analyze three types of T cell receptor (TCR) repertoire data (antigen-specific TCRs, TCR-repertoire, and single-cell RNA + TCRαβ-sequencing data) from 515 patients with primary or metastatic melanoma and compare it to 783 healthy controls. Although melanoma-associated antigen (MAA)-specific TCRs are restricted to individuals, they share sequence similarities that allow us to build classifiers for predicting anti-MAA T cells. The frequency of anti-MAA T cells distinguishes melanoma patients from healthy and predicts metastatic recurrence from primary melanoma. Anti-MAA T cells have stem-like properties and frequent interactions with regulatory T cells and tumor cells via Galectin9-TIM3 and PVR-TIGIT axes, respectively. In the responding patients, the number of expanded anti-MAA clones is higher after the anti-PD1(+anti-CTLA4) therapy and the exhaustion phenotype is rescued. Our systems immunology approach paves the way for understanding antigen-specific responses in human disorders.


Subject(s)
Hepatitis A Virus Cellular Receptor 2 , Melanoma , Humans , RNA , Receptors, Antigen, T-Cell/genetics , Receptors, Antigen, T-Cell, alpha-beta/genetics
11.
Bioinformatics ; 38(16): 3863-3870, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35786716

ABSTRACT

MOTIVATION: Research on epigenetic modifications and other chromatin features at genomic regulatory elements elucidates essential biological mechanisms including the regulation of gene expression. Despite the growing number of epigenetic datasets, new tools are still needed to discover novel distinctive patterns of heterogeneous epigenetic signals at regulatory elements. RESULTS: We introduce ChromDMM, a product Dirichlet-multinomial mixture model for clustering genomic regions that are characterized by multiple chromatin features. ChromDMM extends the mixture model framework by profile shifting and flipping that can probabilistically account for inaccuracies in the position and strand-orientation of the genomic regions. Owing to hyper-parameter optimization, ChromDMM can also regularize the smoothness of the epigenetic profiles across the consecutive genomic regions. With simulated data, we demonstrate that ChromDMM clusters, shifts and strand-orients the profiles more accurately than previous methods. With ENCODE data, we show that the clustering of enhancer regions in the human genome reveals distinct patterns in several chromatin features. We further validate the enhancer clusters by their enrichment for transcriptional regulatory factor binding sites. AVAILABILITY AND IMPLEMENTATION: ChromDMM is implemented as an R package and is available at https://github.com/MariaOsmala/ChromDMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Epigenomics , Genome, Human , Humans , Cluster Analysis , Chromatin/genetics , Epigenesis, Genetic
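
For readers unfamiliar with the building block named in the abstract above, the following sketch evaluates the log-probability of a count vector under a single Dirichlet-multinomial distribution, and the "product" form obtained by summing log-probabilities over several independent features. It covers only this one component; the mixture weights, profile shifting and flipping, and hyper-parameter optimization described for ChromDMM are not shown. The function names and the toy inputs are hypothetical.

```python
from math import lgamma, exp

def dirichlet_multinomial_logpmf(counts, alpha):
    """log P(counts | alpha) for the Dirichlet-multinomial distribution."""
    n = sum(counts)
    a = sum(alpha)
    # log of the multinomial coefficient n! / prod(x_k!)
    log_coef = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # log of Gamma(a) / Gamma(n + a)
    log_ratio = lgamma(a) - lgamma(n + a)
    # log of prod Gamma(x_k + alpha_k) / Gamma(alpha_k)
    log_terms = sum(lgamma(x + al) - lgamma(al) for x, al in zip(counts, alpha))
    return log_coef + log_ratio + log_terms

def product_logpmf(counts_by_feature, alpha_by_feature):
    """Sum of per-feature log-probabilities, i.e. the log of their product."""
    return sum(
        dirichlet_multinomial_logpmf(c, a)
        for c, a in zip(counts_by_feature, alpha_by_feature)
    )

# With alpha = (1, 1) every split of 3 counts over 2 bins is equally likely
p = exp(dirichlet_multinomial_logpmf([1, 2], [1.0, 1.0]))
```

Working in log space with `lgamma` avoids overflow for the large read counts that are typical of sequencing-based chromatin data.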
12.
Diabetologia ; 65(9): 1534-1540, 2022 09.
Article in English | MEDLINE | ID: mdl-35716175

ABSTRACT

AIMS/HYPOTHESIS: Distinct DNA methylation patterns have recently been observed to precede type 1 diabetes in whole blood collected from young children. Our aim was to determine whether perinatal DNA methylation is associated with later progression to type 1 diabetes. METHODS: Reduced representation bisulphite sequencing (RRBS) analysis was performed on umbilical cord blood samples collected within the Finnish Type 1 Diabetes Prediction and Prevention (DIPP) Study. Children later diagnosed with type 1 diabetes and/or who tested positive for multiple islet autoantibodies (n = 43) were compared with control individuals (n = 79) who remained autoantibody-negative throughout the DIPP follow-up until 15 years of age. Potential confounding factors related to the pregnancy and the mother were included in the analysis. RESULTS: No differences in the umbilical cord blood methylation patterns were observed between the cases and controls at a false discovery rate <0.05. CONCLUSIONS/INTERPRETATION: Based on our results, differences between children who progress to type 1 diabetes and those who remain healthy throughout childhood are not yet present in the perinatal DNA methylome. However, we cannot exclude the possibility that such differences would be found in a larger dataset.


Subject(s)
Diabetes Mellitus, Type 1 , Autoantibodies , Child , Child, Preschool , DNA Methylation/genetics , Female , Fetal Blood/metabolism , Glutamate Decarboxylase , Humans , Pregnancy
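
The abstract above reports results at a false discovery rate below 0.05 without naming the adjustment procedure. As an illustration only, and assuming the widely used Benjamini-Hochberg procedure (an assumption, not something stated in the abstract), adjusted p-values can be computed as follows. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in adjusted]
```

A site would be reported as differentially methylated at FDR < 0.05 only if its adjusted value falls below 0.05.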
13.
BMC Bioinformatics ; 23(1): 212, 2022 Jun 03.
Article in English | MEDLINE | ID: mdl-35659235

ABSTRACT

BACKGROUND: Transcription factors (TFs) bind regulatory DNA regions with sequence specificity, form complexes and regulate gene expression. In cooperative TF-TF binding, two transcription factors bind onto a shared DNA binding site as a pair. Previous work has demonstrated pairwise TF-TF-DNA interactions with position weight matrices (PWMs), which may however not sufficiently take into account the complexity and flexibility of pairwise binding. RESULTS: We propose two random forest (RF) methods for joint TF-TF binding site prediction: ComBind and JointRF. We train models with previously published large-scale CAP-SELEX DNA libraries, which comprise DNA sequences enriched for binding of a selected TF pair. JointRF builds a random forest with sub-sequences selected from CAP-SELEX DNA reads with previously proposed pairwise PWM. JointRF outperforms (area under receiver operating characteristics curve, AUROC, 0.75) the current state-of-the-art method i.e. orientation and spacing specific pairwise PWMs (AUROC 0.59). Thus, JointRF may be utilized to improve prediction accuracy for pre-determined binding preferences. However, pairwise TF binding is currently considered flexible; a pair may bind DNA with different orientations and amounts of dinucleotide gaps or overlap between the two motifs. Thus, we developed ComBind, which utilizes random forests by considering simultaneously multiple orientations and spacings of the two factors. Our approach outperforms (AUROC 0.78) PWMs, as well as JointRF (p<0.00195). ComBind provides an approach for predicting TF-TF binding sites without prior knowledge on pairwise binding preferences. However, more research is needed to assess ComBind eligibility for practical applications. CONCLUSIONS: Random forest is well suited for modeling pairwise TF-TF-DNA binding specificities, and ComBind provides an improvement to pairwise binding site prediction accuracy.


Subject(s)
DNA , Transcription Factors , Binding Sites/genetics , DNA/genetics , Position-Specific Scoring Matrices , Protein Binding , Transcription Factors/metabolism
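
The comparison in the abstract above rests on the area under the receiver operating characteristic curve (AUROC). A minimal sketch of that metric, using its interpretation as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, is given below. The function name and the example scores are hypothetical and are not data from the study.

```python
def auroc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores for bound (positive) and unbound (negative) sequences
value = auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading values such as the 0.59, 0.75 and 0.78 quoted above. The quadratic loop is fine for small examples; a rank-based formulation would be preferable for large datasets.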
14.
BMC Bioinformatics ; 23(1): 119, 2022 Apr 04.
Article in English | MEDLINE | ID: mdl-35379172

ABSTRACT

BACKGROUND: cfMeDIP-seq is a low-cost method for determining the DNA methylation status of cell-free DNA and it has been successfully combined with statistical methods for accurate cancer diagnostics. We investigate the diagnostic classification aspect by applying statistical tests and dimension reduction techniques for feature selection and probabilistic modeling for the cancer type classification, and we also study the effect of sequencing depth. METHODS: We experiment with a variety of statistical methods that use different feature selection and feature extraction methods as well as probabilistic classifiers for diagnostic decision making. We test the (moderated) t-tests and the Fisher's exact test for feature selection, principal component analysis (PCA) as well as iterative supervised PCA (ISPCA) for feature generation, and GLMnet and logistic regression methods with sparsity promoting priors for classification. Probabilistic programming language Stan is used to implement Bayesian inference for the probabilistic models. RESULTS AND CONCLUSIONS: We compare overlaps of differentially methylated genomic regions as chosen by different feature selection methods, and evaluate probabilistic classifiers by their area under the receiver operating characteristic curve scores on discovery and validation cohorts. While we observe that many methods perform as well as, and occasionally considerably better than, GLMnet, which was originally proposed for cfMeDIP-seq based cancer classification, we also observe that the performance of different methods varies across sequencing depths, cancer types and study cohorts. Overall, methods that seem robust and promising include Fisher's exact test and ISPCA for feature selection as well as a simple logistic regression model with the number of hyper and hypo-methylated regions as features.


Subject(s)
Cell-Free Nucleic Acids , Neoplasms , Algorithms , Bayes Theorem , DNA Methylation , Humans , Models, Statistical , Neoplasms/diagnosis , Neoplasms/genetics
15.
Nat Commun ; 13(1): 1981, 2022 04 11.
Article in English | MEDLINE | ID: mdl-35411050

ABSTRACT

T cell large granular lymphocytic leukemia (T-LGLL) is a rare lymphoproliferative disorder of mature, clonally expanded T cells, where somatic-activating STAT3 mutations are common. Although T-LGLL has been described as a chronic T cell response to an antigen, the function of the non-leukemic immune system in this response is largely uncharacterized. Here, by utilizing single-cell RNA and T cell receptor profiling (scRNA+TCRαβ-seq), we show that irrespective of STAT3 mutation status, T-LGLL clonotypes are more cytotoxic and exhausted than healthy reactive clonotypes. In addition, T-LGLL clonotypes show more active cell communication than reactive clones with non-leukemic immune cells via costimulatory cell-cell interactions, monocyte-secreted proinflammatory cytokines, and T-LGLL-clone-secreted IFNγ. Besides the leukemic repertoire, the non-leukemic T cell repertoire in T-LGLL is also more mature, cytotoxic, and clonally restricted than in other cancers and autoimmune disorders. Finally, 72% of the leukemic T-LGLL clonotypes share T cell receptor similarities with their non-leukemic repertoire, linking the leukemic and non-leukemic repertoires together via possible common target antigens. Our results provide a rationale to prioritize therapies that target the entire immune repertoire and not only the T-LGLL clonotype.


Subject(s)
Leukemia, Large Granular Lymphocytic , CD8-Positive T-Lymphocytes , Humans , Leukemia, Large Granular Lymphocytic/genetics , Mutation , Receptors, Antigen, T-Cell/genetics , T-Lymphocytes
16.
Epigenetics ; 17(12): 1608-1627, 2022 12.
Article in English | MEDLINE | ID: mdl-35246015

ABSTRACT

DNA methylation patterns are largely established in-utero and might mediate the impacts of in-utero conditions on later health outcomes. Associations between perinatal DNA methylation marks and pregnancy-related variables, such as maternal age and gestational weight gain, have been earlier studied with methylation microarrays, which typically cover less than 2% of human CpG sites. To detect such associations outside these regions, we chose the bisulphite sequencing approach. We collected and curated clinical data on 200 newborn infants, whose umbilical cord blood samples were analysed with the reduced representation bisulphite sequencing (RRBS) method. A generalized linear mixed-effects model was fit for each high coverage CpG site, followed by spatial and multiple testing adjustment of P values to identify differentially methylated cytosines (DMCs) and regions (DMRs) associated with clinical variables, such as maternal age, mode of delivery, and birth weight. Type 1 error rate was then evaluated with a permutation analysis. We discovered a strong inflation of spatially adjusted P values through the permutation analysis, which we then applied for empirical type 1 error control. The inflation of P values was caused by a common method for spatial adjustment and DMR detection, implemented in tools comb-p and RADMeth. Based on empirically estimated significance thresholds, very little differential methylation was associated with any of the studied clinical variables, other than sex. With this analysis workflow, the sex-associated differentially methylated regions were highly reproducible across studies, technologies, and statistical models.


Subject(s)
DNA Methylation , Fetal Blood , Infant, Newborn , Pregnancy , Female , Humans , Fetal Blood/metabolism , Data Analysis , Sequence Analysis, DNA
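
The abstract above mentions empirically estimated significance thresholds derived from a permutation analysis, without giving the exact procedure. One simple way such a threshold could be obtained, offered here as an assumption rather than the authors' method, is to collect P values from analyses run on label-permuted data and pick the cutoff that lets through at most a fraction alpha of them. The function names and the toy null distribution are hypothetical.

```python
def empirical_threshold(null_p_values, alpha=0.05):
    """Cutoff t such that at most a fraction alpha of null P values are strictly below t."""
    ordered = sorted(null_p_values)
    k = int(alpha * len(ordered))  # number of null P values allowed below the cutoff
    return ordered[min(k, len(ordered) - 1)]

def empirical_p_value(observed_p, null_p_values):
    """Permutation-based P value with the usual +1 correction."""
    hits = sum(1 for p in null_p_values if p <= observed_p)
    return (1 + hits) / (1 + len(null_p_values))

# Toy null distribution standing in for P values obtained from permuted labels
null = [i / 100 for i in range(1, 101)]
threshold = empirical_threshold(null, alpha=0.05)
false_positives = sum(1 for p in null if p < threshold)
```

If the adjusted P values are inflated under the null, as the abstract reports for the spatial adjustment step, the empirical threshold will be far smaller than the nominal alpha, and fewer findings will pass it.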
17.
Diabetologia ; 65(5): 844-860, 2022 05.
Article in English | MEDLINE | ID: mdl-35142878

ABSTRACT

AIMS/HYPOTHESIS: Type 1 diabetes is a chronic autoimmune disease of complex aetiology, including a potential role for epigenetic regulation. Previous epigenomic studies focused mainly on clinically diagnosed individuals. The aim of the study was to assess early DNA methylation changes associated with type 1 diabetes already before the diagnosis or even before the appearance of autoantibodies. METHODS: Reduced representation bisulphite sequencing (RRBS) was applied to study DNA methylation in purified CD4+ T cell, CD8+ T cell and CD4-CD8- cell fractions of 226 peripheral blood mononuclear cell samples longitudinally collected from seven type 1 diabetes-specific autoantibody-positive individuals and control individuals matched for age, sex, HLA risk and place of birth. We also explored correlations between DNA methylation and gene expression using RNA sequencing data from the same samples. Technical validation of RRBS results was performed using pyrosequencing. RESULTS: We identified 79, 56 and 45 differentially methylated regions in CD4+ T cells, CD8+ T cells and CD4-CD8- cell fractions, respectively, between type 1 diabetes-specific autoantibody-positive individuals and control participants. The analysis of pre-seroconversion samples identified DNA methylation signatures at the very early stage of disease, including differential methylation at the promoter of IRF5 in CD4+ T cells. Further, we validated RRBS results using pyrosequencing at the following CpG sites: chr19:18118304 in the promoter of ARRDC2; chr21:47307815 in the intron of PCBP3; and chr14:81128398 in the intergenic region near TRAF3 in CD4+ T cells. CONCLUSIONS/INTERPRETATION: These preliminary results provide novel insights into cell type-specific differential epigenetic regulation of genes, which may contribute to type 1 diabetes pathogenesis at the very early stage of disease development. Should these findings be validated, they may serve as a potential signature useful for disease prediction and management.


Subject(s)
DNA Methylation , Diabetes Mellitus, Type 1 , Autoantibodies/genetics , Autoimmunity/genetics , CD8-Positive T-Lymphocytes , Child , CpG Islands , DNA Methylation/genetics , Diabetes Mellitus, Type 1/genetics , Epigenesis, Genetic/genetics , Humans , Leukocytes, Mononuclear
18.
Comput Biol Med ; 143: 105268, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35131609

ABSTRACT

High-throughput technologies produce gene expression time-series data that need fast and specialized algorithms to be processed. While current methods already deal with different aspects, such as the non-stationarity of the process and the temporal correlation, they often fail to take into account the pairing among replicates. We propose PairGP, a non-stationary Gaussian process method to compare gene expression time-series across several conditions that can account for paired longitudinal study designs and can identify groups of conditions that have different gene expression dynamics. We demonstrate the method on both simulated data and previously unpublished RNA sequencing (RNA-seq) time-series with five conditions. The results show the advantage of modeling the pairing effect to better identify groups of conditions with different dynamics. The pairing effect model displays good capabilities of selecting the most probable grouping of conditions even in the presence of a high number of conditions. The developed method is of general application and can be applied to any gene expression time series dataset. The model can identify common replicate effects among the samples coming from the same biological replicates and model those as separate components. Learning the pairing effect as a separate component not only allows us to exclude it from the model to get better estimates of the condition effects, but also improves the precision of the model selection process. The pairing effect, which was previously treated as noise, is now identified as a separate component, resulting in more accurate and explanatory models of the data.

19.
BMC Bioinformatics ; 23(1): 41, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35030989

ABSTRACT

BACKGROUND: DNA methylation is commonly measured using bisulfite sequencing (BS-seq). The quality of a BS-seq library is measured by its bisulfite conversion efficiency. Libraries with low conversion rates are typically excluded from analysis resulting in reduced coverage and increased costs. RESULTS: We have developed a probabilistic method and software, LuxRep, that implements a general linear model and simultaneously accounts for technical replicates (libraries from the same biological sample) from different bisulfite-converted DNA libraries. Using simulations and actual DNA methylation data, we show that including technical replicates with low bisulfite conversion rates generates more accurate estimates of methylation levels and differentially methylated sites. Moreover, using variational inference speeds up computation time necessary for whole genome analysis. CONCLUSIONS: In this work we show that taking into account technical replicates (i.e. libraries) of BS-seq data of varying bisulfite conversion rates, with their corresponding experimental parameters, improves methylation level estimation and differential methylation detection.


Subject(s)
Data Analysis , Sulfites , DNA Methylation , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA